Ludwig - Papers: Computer Science - Artificial Intelligence

Continual Learning: Fast and Slow

Quang Pham, et al. • (2022) • DOI: 10.48550/arXiv.2209.02370

According to the Complementary Learning Systems (CLS) theory (McClelland et al., 1995) in neuroscience, humans do effective continual learning through two complementary systems: a fast learnin...

The Future of Continual Learning in the Era of Foundation Models: Three Key Directions

Jack Bell, et al. • (2025) • DOI: 10.48550/arXiv.2506.03320

Continual learning--the ability to acquire, retain, and refine knowledge over time--has always been fundamental to intelligence, both human and artificial. Historically, different AI paradigms have ac...

Fantastic Pretraining Optimizers and Where to Find Them

Kaiyue Wen, et al. • (2025) • DOI: 10.48550/arXiv.2509.02046

AdamW has long been the dominant optimizer in language model pretraining, despite numerous claims that alternative optimizers offer 1.4 to 2x speedup. We posit that two methodological shortcomings hav...

Autonomous Code Evolution Meets NP-Completeness

Cunxi Yu, et al. • (2025) • DOI: 10.48550/arXiv.2509.07367

Large language models (LLMs) have recently shown strong coding abilities, enabling not only static code generation but also iterative code self-evolving through agentic frameworks. Recently, AlphaEvol...

Learning Universal Predictors

Jordi Grau-Moya, et al. • (2024) • DOI: 10.48550/arXiv.2401.14953

Meta-learning has emerged as a powerful approach to train neural networks to learn new tasks quickly from limited data. Broad exposure to different tasks leads to versatile representations enabling ge...

Large Language Models as Computable Approximations to Solomonoff Induction

Jun Wan, Lingrui Mei • (2025) • DOI: 10.48550/arXiv.2505.15784

The rapid advancement of large language models (LLMs) calls for a rigorous theoretical framework to explain their empirical success. While significant progress has been made in understanding LLM behav...

DataRater: Meta-Learned Dataset Curation

Dan A. Calian, et al. • (2025) • DOI: 10.48550/arXiv.2505.17895

The quality of foundation models depends heavily on their training data. Consequently, great efforts have been put into dataset curation. Yet most approaches rely on manual tuning of coarse-grained mi...

Diffusion Beats Autoregressive in Data-Constrained Settings

Mihir Prabhudesai, et al. • (2025) • DOI: 10.48550/arXiv.2507.15857

Autoregressive (AR) models have long dominated the landscape of large language models, driving progress across a wide range of tasks. Recently, diffusion-based language models have emerged as a promis...

Subliminal Learning: Language models transmit behavioral traits via hidden signals in data

Alex Cloud, et al. • (2025) • DOI: 10.48550/arXiv.2507.14805

We study subliminal learning, a surprising phenomenon where language models transmit behavioral traits via semantically unrelated data. In our main experiments, a "teacher" model with some trait T (su...

Large Language Models and Emergence: A Complex Systems Perspective

David C. Krakauer, et al. • (2025) • DOI: 10.48550/arXiv.2506.11135

Emergence is a concept in complexity science that describes how many-body systems manifest novel higher-level properties, properties that can be described by replacing high-dimensional mechanisms with...

Fast and Simplex: 2-Simplicial Attention in Triton

Aurko Roy, et al. • (2025) • DOI: 10.48550/arXiv.2507.02754

Recent work has shown that training loss scales as a power law with both model size and the number of tokens, and that achieving compute-optimal models requires scaling model size and token count toge...

Intelligence at the Edge of Chaos

Shiyang Zhang, et al. • (2025) • DOI: 10.48550/arXiv.2410.02536

We explore the emergence of intelligent behavior in artificial systems by investigating how the complexity of rule-based systems influences the capabilities of models trained to predict these rules. O...

Trends in AI Supercomputers

Konstantin F. Pilz, et al. • (2025) • DOI: 10.48550/arXiv.2504.16026

Frontier AI development relies on powerful AI supercomputers, yet analysis of these systems is limited. We create a dataset of 500 AI supercomputers from 2019 to 2025 and analyze key trends in perform...

Continuous Thought Machines

Luke Darlow, et al. • (2025) • DOI: 10.48550/arXiv.2505.05522

Biological brains demonstrate complex neural activity, where the timing and interplay between neurons is critical to how brains process information. Most deep learning architectures simplify neural ac...

The Diffusion Duality

Subham Sekhar Sahoo, et al. • (2025) • DOI: 10.48550/arXiv.2506.10892

Uniform-state discrete diffusion models hold the promise of fast text generation due to their inherent ability to self-correct. However, they are typically outperformed by autoregressive models and ma...

Comment on The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity

C. Opus, A. Lawsen • (2025) • DOI: 10.48550/arXiv.2506.09250

Shojaee et al. (2025) report that Large Reasoning Models (LRMs) exhibit "accuracy collapse" on planning puzzles beyond certain complexity thresholds. We demonstrate that their findings primarily refle...

Chain-of-Thought Reasoning is a Policy Improvement Operator

Hugh Zhang, David C. Parkes • (2023) • DOI: 10.48550/arXiv.2309.08589

Large language models have astounded the world with fascinating new capabilities. However, they currently lack the ability to teach themselves new skills, relying instead on large amounts of human-gen...

General agents need world models

Jonathan Richens, et al. • (2025) • DOI: 10.48550/arXiv.2506.01622

Are world models a necessary ingredient for flexible, goal-directed behaviour, or is model-free learning sufficient? We provide a formal answer to this question, showing that any agent capable of gene...

Reasoning with Language Model is Planning with World Model

Shibo Hao, et al. • (2023) • DOI: 10.48550/arXiv.2305.14992

Large language models (LLMs) have shown remarkable reasoning capabilities, especially when prompted to generate intermediate reasoning steps (e.g., Chain-of-Thought, CoT). However, LLMs can still stru...

Compute-Optimal LLMs Provably Generalize Better With Scale

Marc Finzi, et al. • (2025) • DOI: 10.48550/arXiv.2504.15208

Why do larger language models generalize better? To investigate this question, we develop generalization bounds on the pretraining objective of large language models (LLMs) in the compute-optimal regi...

Large Language Model Compression with Global Rank and Sparsity Optimization

Changhai Zhou, et al. • (2025) • DOI: 10.48550/arXiv.2505.03801

Low-rank and sparse composite approximation is a natural idea to compress Large Language Models (LLMs). However, such an idea faces two primary challenges that adversely affect the performance of exis...

Darwin Godel Machine: Open-Ended Evolution of Self-Improving Agents

Jenny Zhang, et al. • (2025) • DOI: 10.48550/arXiv.2505.22954

Today's AI systems have human-designed, fixed architectures and cannot autonomously and continuously improve themselves. The advance of AI could itself be automated. If done safely, that would acceler...

Geometric Deep Learning: Grids, Groups, Graphs, Geodesics, and Gauges

Michael M. Bronstein, et al. • (2021) • DOI: 10.48550/arXiv.2104.13478

The last decade has witnessed an experimental revolution in data science and machine learning, epitomised by deep learning methods. Indeed, many high-dimensional learning tasks previously thought to b...

ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models

Mingjie Liu, et al. • (2025) • DOI: 10.48550/arXiv.2505.24864

Recent advances in reasoning-centric language models have highlighted reinforcement learning (RL) as a promising method for aligning models with verifiable rewards. However, it remains contentious whe...

Sequential Monte Carlo Steering of Large Language Models using Probabilistic Programs

Alexander K. Lew, et al. • (2023) • DOI: 10.48550/arXiv.2306.03081

Even after fine-tuning and reinforcement learning, large language models (LLMs) can be difficult, if not impossible, to control reliably with prompts alone. We propose a new inference-time approach to...

AceReason-Nemotron: Advancing Math and Code Reasoning through Reinforcement Learning

Yang Chen, et al. • (2025) • DOI: 10.48550/arXiv.2505.16400

Despite recent progress in large-scale reinforcement learning (RL) for reasoning, the training recipe for building high-performing reasoning models remains elusive. Key implementation details of front...

Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model?

Yang Yue, et al. • (2025) • DOI: 10.48550/arXiv.2504.13837

Reinforcement Learning with Verifiable Rewards (RLVR) has recently demonstrated notable success in enhancing the reasoning performance of large language models (LLMs), particularly on mathematics and ...

Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning

Shenzhi Wang, et al. • (2025) • DOI: 10.48550/arXiv.2506.01939

Reinforcement Learning with Verifiable Rewards (RLVR) has emerged as a powerful approach to enhancing the reasoning capabilities of Large Language Models (LLMs), while its mechanisms are not yet well ...

REASONING GYM: Reasoning Environments for Reinforcement Learning with Verifiable Rewards

Zafir Stojanovski, et al. • (2025) • DOI: 10.48550/arXiv.2505.24760

We introduce Reasoning Gym (RG), a library of reasoning environments for reinforcement learning with verifiable rewards. It provides over 100 data generators and verifiers spanning multiple domains in...

Learning to Model the World with Language

Jessy Lin, et al. • (2024) • DOI: 10.48550/arXiv.2308.01399

To interact with humans and act in the world, agents need to understand the range of language that people use and relate it to the visual world. While current agents can learn to execute simple langua...

Deep Reinforcement Learning, a textbook

Aske Plaat • (2022)

Deep reinforcement learning has gathered much attention recently. Impressive results were achieved in activities as diverse as autonomous driving, game playing, molecular recombination, and robotics. ...

Absolute Zero: Reinforced Self-play Reasoning with Zero Data

Andrew Zhao, et al. • (2025) • DOI: 10.48550/arXiv.2505.03335

Reinforcement learning with verifiable rewards (RLVR) has shown promise in enhancing the reasoning capabilities of large language models by learning directly from outcome-based rewards. Recent RLVR wo...

Visual Planning: Let's Think Only with Images

Yi Xu, et al. • (2025) • DOI: 10.48550/arXiv.2505.11409

Recent advancements in Large Language Models (LLMs) and their multimodal extensions (MLLMs) have substantially enhanced machine reasoning across diverse tasks. However, these models predominantly rely...

The Platonic Representation Hypothesis

Minyoung Huh, et al. • (2024) • DOI: 10.48550/arXiv.2405.07987

We argue that representations in AI models, particularly deep networks, are converging. First, we survey many examples of convergence in the literature: over time and across multiple domains, the ways...

Illuminating search spaces by mapping elites

Jean-Baptiste Mouret, Jeff Clune • (2015) • DOI: 10.48550/arXiv.1504.04909

Many fields use search algorithms, which automatically explore a search space to find high-performing solutions: chemists search through the space of molecules to discover new drugs; engineers search ...

Mechanism of feature learning in deep fully connected networks and kernel machines that recursively learn features

Adityanarayanan Radh..., et al. • (2023) • DOI: 10.48550/arXiv.2212.13881

In recent years neural networks have achieved impressive results on many technological and scientific tasks. Yet, the mechanism through which these models automatically select features, or patterns in...

Voyager: An Open-Ended Embodied Agent with Large Language Models

Guanzhi Wang, et al. • (2023) • DOI: 10.48550/arXiv.2305.16291

We introduce Voyager, the first LLM-powered embodied lifelong learning agent in Minecraft that continuously explores the world, acquires diverse skills, and makes novel discoveries without human inter...

FBI-LLM: Scaling Up Fully Binarized LLMs from Scratch via Autoregressive Distillation

Liqun Ma, et al. • (2024) • DOI: 10.48550/arXiv.2407.07093

This work presents a Fully BInarized Large Language Model (FBI-LLM), demonstrating for the first time how to train a large-scale binary language model from scratch (not the partial binary or ternary L...

Layers at Similar Depths Generate Similar Activations Across LLM Architectures

Christopher Wolfram, Aaron Schein • (2025) • DOI: 10.48550/arXiv.2504.08775

How do the latent spaces used by independently-trained LLMs relate to one another? We study the nearest neighbor relationships induced by activations at different layers of 24 open-weight LLMs, and fi...

What, How, Where, and How Well? A Survey on Test-Time Scaling in Large Language Models

Qiyuan Zhang, et al. • (2025) • DOI: 10.48550/arXiv.2503.24235

As enthusiasm for scaling computation (data and parameters) in the pretraining era gradually diminished, test-time scaling (TTS)—also referred to as “test-time computing”—has emerged as a prominent re...

Self-Supervised Learning from Images with a Joint-Embedding Predictive Architecture

Mahmoud Assran, et al. • (2023) • DOI: 10.48550/arXiv.2301.08243

This paper demonstrates an approach for learning highly semantic image representations without relying on hand-crafted data-augmentations. We introduce the Image-based Joint-Embedding Predictive Archi...

Critical Tokens Matter: Token-Level Contrastive Estimation Enhances LLM's Reasoning Capability

Zicheng Lin, et al. • (2025) • DOI: 10.48550/arXiv.2411.19943

Mathematical reasoning tasks pose significant challenges for large language models (LLMs) because they require precise logical deduction and sequence analysis. In this work, we introduce the concept o...

A mathematical theory of semantic development in deep neural networks

Andrew M. Saxe, et al. • Proceedings of the National Academy of Sciences • (2019) • DOI: 10.1073/pnas.1820226116

An extensive body of empirical research has revealed remarkable regularities in the acquisition, organization, deployment, and neural representation of human semantic knowledge, thereby raising a fund...

Progress measures for grokking via mechanistic interpretability

Neel Nanda, et al. • (2023) • DOI: 10.48550/arXiv.2301.05217

Neural networks often exhibit emergent behavior, where qualitatively new capabilities arise from scaling up the amount of parameters, training data, or training steps. One approach to understanding em...

Driven by Compression Progress: A Simple Principle Explains Essential Aspects of Subjective Beauty, Novelty, Surprise, Interestingness, Attention, Curiosity, Creativity, Art, Science, Music, Jokes

Juergen Schmidhuber • (2009) • DOI: 10.48550/arXiv.0812.4360

I argue that data becomes temporarily interesting by itself to some self-improving, but computationally limited, subjective observer once he learns to predict or compress the data in a better way, thu...

On the Emergence of Thinking in LLMs I: Searching for the Right Intuition

Guanghao Ye, et al. • (2025) • DOI: 10.48550/arXiv.2502.06773

Recent advancements in AI, such as OpenAI’s new o models, Google’s Gemini Thinking model, and Deepseek R1, are transforming LLMs into LRMs (Large Reasoning Models). Unlike LLMs, LRMs perform thinking ...
